Abstract The breakdown point in its different variants is one of the central notions to
quantify the global robustness of a procedure. We propose a simple supplementary variant which
is useful in situations where we have no obvious or only partial equivariance: Extending
the Donoho and Huber (1983) Finite Sample Breakdown Point, we propose the Expected
Finite Sample Breakdown Point to produce less configuration-dependent values while still
preserving the finite sample aspect of the former definition.

We apply this notion for joint estimation of scale and shape (with only scale-equivariance
available), exemplified for generalized Pareto, generalized extreme value, Weibull, and Gamma
distributions.

In these settings, we are interested in highly-robust, easy-to-compute initial estimators;
to this end we study Pickands-type and Location-Dispersion-type estimators and compute
their respective breakdown points.

Keywords global robustness, finite sample breakdown point, partial equivariance,
scale-shape parametric family, LD estimator
1 Introduction

In an industrial project to compute robust variants of OpVar, i.e., the regulatory capital as
required in Basel II (2006) for a bank to cover its operational risk, we came across the
problem of determining the (finite sample) breakdown point of certain considered procedures.
Here operational risk is by definition "the risk of direct or indirect loss resulting from
inadequate or failed internal processes, people and systems or from external events."



This work was supported by a DAAD scholarship for N.H.

P. Ruckdeschel · N. Horbenko
Fraunhofer ITWM, Department of Financial Mathematics,
Fraunhofer-Platz 1, D-67663 Kaiserslautern
and Dept. of Mathematics, University of Kaiserslautern,
P.O.Box 3049, D-67653 Kaiserslautern
E-mail: peter.ruckdeschel@itwm.fraunhofer.de
nataliya.horbenko@itwm.fraunhofer.de



These extremal events, as motivated by the Pickands-Balkema-de Haan Extreme Value
Theorem (see Balkema and de Haan (1974), Pickands (1975)), suggest the use of the
generalized Pareto distribution (GPD) for modeling in this context. In an intermediate step this
modeling involves estimation of the scale and shape parameters of this distribution. To this
end, several robust procedures have been proposed in the literature, see Ruckdeschel and Horbenko
(2010) for a more detailed discussion.

One of the quantities to judge robustness of a procedure is the breakdown point (see
Definition 3.1). In particular, we are interested in the finite sample version FSBP of this
notion to be able to quantify the degree of protection a procedure provides in the estimation
at an actual (finite) set of observations.

It turns out that for our purposes the original definition has some drawbacks, as it
depends strongly on the configuration of the actual sample. To get rid of the dependence on
possibly highly improbable sample configurations while still preserving the aspect of a finite
sample, we propose an expected FSBP, EFSBP, i.e., to integrate out the FSBP with respect
to the ideal distribution.

We illustrate the usefulness of this new concept for scale-shape models by means of
two types of robust estimators, quantile-type estimators (Pickands Estimator PE) and robust
Location-Dispersion (LD) estimators as introduced by Marazzi and Ruffieux (1999); for
the latter type we study estimators based on the median for the location part and several
robust scale estimators for the dispersion part: a (new) asymmetric version of the median
of absolute deviations kMAD, as well as Qn and Sn from Rousseeuw and Croux (1993),
combined to MedkMAD, MedQn, and MedSn, respectively.

These estimators are meant to be used as initial estimators with acceptable to good
global robustness properties for (more efficient) robust estimators afterwards. In particular,
they can be computed without the need of additional (robust, consistent) initial estimators,
which precludes otherwise promising alternatives like Minimum Distance estimators, for
which we could have read off asymptotic breakdown point values as high as half the optimal
value from Donoho and Liu (1988). We have also excluded the method-of-median approach
of Peng and Welsh (2001), because in contrast to PE and MedkMAD, MedQn, and MedSn,
for this estimator in the GPD and GEVD case, no explicit calculations are possible. We
have studied this approach in another paper, though (Ruckdeschel and Horbenko (2010)),
and empirically found that in the GPD case its breakdown behavior is worse than the one of
MedkMAD and MedQn.

Our paper is organized as follows: In Section 2 we list our reference examples for
scale-shape models, i.e., the generalized Pareto, the generalized extreme value, the Weibull, and
the Gamma distribution, as well as the Gross Error model which we use to capture deviations
from the ideal model. In Section 3 we recall the standard definitions of the asymptotic and
finite sample breakdown points ABP and FSBP and introduce the new concept of EFSBP in
Definition 3.2. Section 4 then defines the considered estimators, i.e., quantile-type estimators
PE, and LD estimators MedkMAD, MedQn, MedSn. At these estimators, we demonstrate
our new breakdown point notion in Section 5, giving analytic formulae for FSBP, ABP,
and EFSBP in Propositions 5.1, 5.2, and 5.3, together with some numerical evaluations of
EFSBP at some reference situation and with simulation-based evaluations. Proofs for our
results are gathered in Appendix A.



Remark 1.1 This paper is a part of the PhD thesis of the second author; a preliminary version of it may
be found in Ruckdeschel and Horbenko (2010).



2 Model Setting 

For notions of invariance of statistical models and equivariance of estimators we refer to
Eaton (1989): Given a measurable space $(\Omega, \mathcal{A})$, a family of probability measures
$\mathcal{P}$ defined on $\mathcal{A}$ is a statistical model.

Notationally, we use the same symbol for the cumulative distribution function (c.d.f.) and the probability
measure; we write $F(x-0)$ to denote left and, correspondingly, $+0$ for right limits, and $F^{-}$ to denote the right
continuous quantile function given by $F^{-}(s) = \inf\{t \in \mathbb{R} : F(t) > s\}$.

Definition 1 Suppose a group $G$ acts measurably on $\Omega$. Model $\mathcal{P}$ is called $G$-invariant iff
for each $P \in \mathcal{P}$, the image probability $gP$ of $P$ under group action $g$ stays in $\mathcal{P}$.

For simplicity, we assume that $g(P_1) = g(P_2)$ implies $P_1 = P_2$ for any two elements of $\mathcal{P}$. In
a $G$-invariant parametric model $\mathcal{P} = \{P_\theta \mid \theta \in \Theta\}$, where $\Theta$ is the parameter space, group
$G$ induces an isomorphic group $\bar G$, acting on the parameter space with the identification
$g(P_\theta) = P_{\bar g(\theta)}$. In this situation, a point estimator $t$ mapping $\Omega$ to $\Theta$ is equivariant iff

$$ t(g(x)) = \bar g(t(x)). $$



2.1 Generalized Pareto Distribution and Other Scale-Shape Families

We illustrate our concepts at scale-shape models; our reference example is the three-parameter
generalized Pareto distribution (GPD) which has c.d.f. and density

$$ F_\theta(x) = 1 - \Big(1 + \xi\,\frac{x-\mu}{\beta}\Big)^{-1/\xi}, \qquad f_\theta(x) = \frac{1}{\beta}\Big(1 + \xi\,\frac{x-\mu}{\beta}\Big)^{-1/\xi-1} \tag{2.1} $$

where $x \ge \mu$ for $\xi \ge 0$, and $\mu < x \le \mu - \beta/\xi$ if $\xi < 0$. It has parameter $\theta = (\xi, \beta, \mu)^\tau$, for
location $\mu$, scale $\beta > 0$ and shape $\xi$. Special cases of GPDs are the uniform ($\xi = -1$), the
exponential ($\xi = 0$, $\mu = 0$), and Pareto ($\xi > 0$, $\beta = 1$) distributions. We limit ourselves to
the case of known location $\mu = 0$ and unknown scale and shape here and abbreviate the pair
$(\beta, \xi)$ by $\vartheta$, i.e., we are concerned with joint estimation of $\vartheta = (\beta, \xi)$ only.

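Other scale-shape families for which our considerations apply mutatis mutandis are the
generalized extreme value distribution (GEVD) given by its c.d.f.

$$ F_\theta(x) = \exp\Big(-\Big(1 + \xi\,\frac{x-\mu}{\beta}\Big)^{-1/\xi}\Big)\, 1_{(\mu - \beta/\xi,\,\infty)}(x) \tag{2.2} $$

the Weibull distribution with density

$$ f_\vartheta(x) = \frac{\xi}{\beta}\Big(\frac{x}{\beta}\Big)^{\xi-1} \exp\big(-(x/\beta)^{\xi}\big)\, 1_{(0,\infty)}(x) \tag{2.3} $$

and the Gamma distribution with density

$$ f_\vartheta(x) = \frac{x^{\xi-1}}{\beta^{\xi}\,\Gamma(\xi)} \exp\big(-(x/\beta)\big)\, 1_{(0,\infty)}(x) \tag{2.4} $$

For the Weibull and Gamma case we require $\xi > 0$, whereas in the GEVD case the same
distinction applies as in the GPD case.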
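The families above are easy to evaluate numerically. The following is a minimal Python sketch (standard library only) of the c.d.f.s and quantile functions of the GPD, GEVD, and Weibull families for known location $\mu = 0$. The function names are illustrative assumptions, not part of the paper, and the limit case $\xi = 0$ is deliberately left out.

```python
import math

def gpd_cdf(x, beta, xi):
    """GPD c.d.f. with location 0, scale beta > 0, shape xi != 0."""
    if x <= 0.0:
        return 0.0
    z = 1.0 + xi * x / beta
    if z <= 0.0:              # beyond the right endpoint -beta/xi (only if xi < 0)
        return 1.0
    return 1.0 - z ** (-1.0 / xi)

def gpd_quantile(p, beta, xi):
    """Inverse of gpd_cdf for 0 < p < 1."""
    return beta * ((1.0 - p) ** (-xi) - 1.0) / xi

def gevd_cdf(x, beta, xi):
    """GEVD c.d.f. with location 0, scale beta > 0, shape xi != 0."""
    z = 1.0 + xi * x / beta
    if z <= 0.0:              # outside the support
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-z ** (-1.0 / xi))

def gevd_quantile(p, beta, xi):
    """Inverse of gevd_cdf for 0 < p < 1."""
    return beta * ((-math.log(p)) ** (-xi) - 1.0) / xi

def weibull_cdf(x, beta, xi):
    """Weibull c.d.f. with scale beta > 0 and shape xi > 0."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-(x / beta) ** xi)

def weibull_quantile(p, beta, xi):
    """Inverse of weibull_cdf for 0 < p < 1."""
    return beta * (-math.log(1.0 - p)) ** (1.0 / xi)
```

For instance, `gpd_quantile(0.5, 1.0, 0.7)` gives the median of the GPD with $\beta = 1$ and $\xi = 0.7$.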



Reparametrization In the Weibull family, passage to the log-observations transforms this
model into a location-scale model with the standard Gumbel as central distribution. This
approach has been taken by Boudt et al. (2011), and allows them to recur to the rich theory
(both classical and robust) available for location-scale models.

In both GPD and GEVD, a similar approach is possible, once instead of $\mu$ we use
$\tilde\mu = \mu - \beta/\xi$, so that in this setting we get

In the GPD case, this leads to a location-scale model with the standard Exponential as
central distribution. This parametrization is used for the two-parameter Pareto distribution, e.g. in
Brazauskas and Serfling (2000). Two issues, however, come with this approach: First,
knowledge of $\mu$ is not the same as knowledge of $\tilde\mu$, so our original setting where $\mu$ was
assumed known does not carry over easily. Second, the corresponding transformed model
about the Exponential distribution is not smooth (not $L_2$-differentiable, to be precise). The
reason for this is essentially that observations around the left endpoint of the distribution carry
overwhelmingly much information about the location parameter. As a consequence, the usual
optimality theory is no longer available, and in the ideal model setting there are estimators
which are consistent at faster rates than the usual $1/\sqrt{n}$. On the other hand, this high accuracy
requires basing inference essentially completely on the minimal observations, which makes
these procedures extremely prone to outliers. Robustifications avoid this problem, but still,
due to the lack of smoothness no optimality theory is available. For this reason, we stick to
the original parametrization.

Our reference model In the sequel, we use the reference values $\beta = 1$ and $\xi = 0.7$ for all our
scale-shape models; in case of the GPD this amounts to moderately fat tails which reflects
well the situation we met in our application to OpVar.

In-/equivariance The reduced model enjoys a certain invariance: with an included scale
component, it remains invariant under scale transformations $s_\beta(x) = \beta x$ of the observations.
Using the matrix $d_\beta = \mathrm{diag}(\beta, 1)$, this invariance is reflected by a corresponding notion of
equivariance of estimators, i.e., an estimator $S$ for $\vartheta = (\beta, \xi)$ is called scale-equivariant if

$$ S(\beta x_1, \dots, \beta x_n) = d_\beta\, S(x_1, \dots, x_n) \tag{2.6} $$

For the shape parameter $\xi$, there is no obvious such invariance, entailing a dependence
of estimator properties like robustness on this parameter.



2.2 Gross Error Model 

Extending the ideal model setting, Robust Statistics defines suitable distributional
neighborhoods about this ideal model. In this paper, we limit ourselves to the Gross Error Model,
i.e., as neighborhoods, we use the sets of all distributions $F^{\rm re}$ representable as

$$ F^{\rm re} = (1-\varepsilon) F^{\rm id} + \varepsilon F^{\rm di} \tag{2.7} $$

for some given size or radius $\varepsilon > 0$, where $F^{\rm id}$ is the underlying ideal distribution and $F^{\rm di}$
some arbitrary, unknown, and uncontrollable contaminating distribution.
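To make the gross error model concrete, the following Python sketch draws a sample from the mixture in (2.7): each observation independently comes from the contaminating distribution with probability $\varepsilon$ and from the ideal distribution otherwise. The ideal distribution is taken to be the GPD with the reference values $\beta = 1$, $\xi = 0.7$; the contamination (a point mass at $10^6$) and all function names are assumptions chosen for illustration only.

```python
import random

def sample_gross_error(n, eps, draw_ideal, draw_contam, rng):
    """Draw n observations from (1 - eps) * F_ideal + eps * F_contam.

    Returns the sample and a list of flags marking contaminated draws.
    """
    sample, flags = [], []
    for _ in range(n):
        contaminated = rng.random() < eps
        sample.append(draw_contam(rng) if contaminated else draw_ideal(rng))
        flags.append(contaminated)
    return sample, flags

def draw_gpd(rng, beta=1.0, xi=0.7):
    """One GPD(beta, xi) draw with location 0 via quantile transformation."""
    u = rng.random()                      # u in [0, 1)
    return beta * ((1.0 - u) ** (-xi) - 1.0) / xi

rng = random.Random(1)                    # fixed seed, for reproducibility only
x, is_outlier = sample_gross_error(1000, 0.05, draw_gpd, lambda r: 1.0e6, rng)
```

In practice the contaminating distribution is unknown; the point mass here merely mimics a worst-case style contamination far out in the tail.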



3 Global Robustness: the Breakdown Point 

In this paper we focus on the Breakdown Point as a global measure of robustness, specifying
the reliability of a procedure under massive deviations from the ideal model. In the gross
error model (2.7), it gives the largest radius $\varepsilon$ at which the estimator still produces meaningful
results.

In standard literature on Robust Statistics, there are two notions of breakdown point,
the asymptotic (functional) breakdown point (ABP) and the finite sample breakdown point
(FSBP), introduced in Hampel (1968) and Donoho and Huber (1983), respectively:

Definition 3.1 (a) (Hampel et al. (1986), 2.2 Definition 1) The asymptotic breakdown
point (ABP) $\varepsilon^*$ of the sequence of estimators $T_n$ for parameter $\theta \in \Theta$ at probability $F$ is
given by

$$ \varepsilon^* := \sup\Big\{ \varepsilon \in (0,1];\ \text{there is a compact set } K_\varepsilon \subset \Theta \text{ s.t. } \pi(F,G) < \varepsilon \implies G(\{T_n \in K_\varepsilon\}) \xrightarrow{n\to\infty} 1 \Big\} \tag{3.1} $$

where $\pi$ is Prokhorov distance.

(b) (Hampel et al. (1986), 2.2 Definition 2) The finite sample breakdown point (FSBP) $\varepsilon^*_n$
of the estimator $T_n$ at the sample $(x_1, \dots, x_n)$ is given by

$$ \varepsilon^*_n(T_n; x_1, \dots, x_n) := \frac{1}{n} \max\Big\{ m;\ \max_{i_1,\dots,i_m}\ \sup_{y_1,\dots,y_m} |T_n(z_1, \dots, z_n)| < \infty \Big\}, \tag{3.2} $$

where the sample $(z_1, \dots, z_n)$ is obtained by replacing the data points $x_{i_1}, \dots, x_{i_m}$ by arbitrary
values $y_1, \dots, y_m$.

Note that $\varepsilon^*_n$ from (3.2) is by $1/n$ smaller than the Donoho and Huber (1983) FSBP.
Definition 3.1 (b) does not cover the scale case, where we must take into account the
possibility of implosion as well: As noted by an anonymous referee, otherwise one could achieve
arbitrarily high breakdown points by choosing estimators based on two very low quantiles,
which of course would not be stable at all, an argument valid in the location-scale case as
well. A remedy for the scale parameter is given by the log-transformation as mentioned in
He (2005), i.e.,

$$ \varepsilon^*_n(T_n; x_1, \dots, x_n) := \frac{1}{n} \max\Big\{ m;\ \max_{i_1,\dots,i_m}\ \sup_{y_1,\dots,y_m} |\log(T_n(z_1, \dots, z_n))| < \infty \Big\}. \tag{3.3} $$

Breakdown and partial invariance By arguments given in Davies and Gather (2005), a
certain equivariance of the considered estimator under a suitable group of transformations is
required to obtain meaningful upper bounds for the breakdown point. In our scale-shape
models, however, as indicated in Section 2.1, we canonically only have scale invariance.
This lack of complete equivariance does not invalidate the cited authors' considerations, but
rather these can be extended to also cover this partial invariance:

Due to the lack of shape-equivariance, we conjecture that defective constructions similar
to those producing breakdown points arbitrarily close to 1 in the AR(1) case (as mentioned
in Genton and Lucas (2005)) should be feasible in the pure shape case as well. In
the joint scale-shape case, however, imposing scale-equivariance does yield sensible upper bounds,
as such constructions are eliminated by this (partial) equivariance.



In particular, as the scale model is a submodel of our scale-shape model, the corresponding
upper bounds for the maximal breakdown point among all scale-equivariant estimators
from Davies and Gather (2005, Thms. 3.1, 3.2) remain valid in our setting without change.
Hence, in the sequel, we restrict ourselves to scale-equivariant estimators. In particular,
following Davies and Gather (2005, sec. 4.2), we note that with $n_0$ being the highest frequency
of a single data point in the original sample,

(adapted to (3.2)) among all scale-equivariant estimators.

Breakdown and restricted parameter space In the GPD and GEVD families, there are two
canonical parameter spaces for $\xi$: Either one does not impose any restriction, i.e., $\xi \in \mathbb{R}$,
which could be seen as "natural" there, or one restricts $\xi$ to be positive (which is the only
possibility for the Weibull and Gamma case).

In the GPD and GEVD case, $\xi = 0$ is a discontinuity as to the statistical properties of the
model, comparable to parameter values $\pm 1$ in the AR(1) model. While GPD and GEVD for
$\xi < 0$ have compact support, in the AR(1) model $\pm 1$ mark the border of stationarity. In both
cases, the discontinuity only becomes visible when passing to sequences of observations, in
our case when motivating GPD and GEVD by asymptotic arguments, i.e., by the
Pickands-Balkema-de Haan and Fisher-Tippett-Gnedenko Extreme Value Theorems. To this end we
need a uniformity over sets of quantiles which gets lost when passing over the value $\xi = 0$.
In particular, shape in the GPD and GEVD models determines to which domain of attraction
the underlying distribution belongs in the corresponding Extreme Value Limit Theorems. In
both the scale-shape and the AR(1) case, it is hence well debatable to restrict the parameter
space accordingly, see Genton and Lucas (2005) and the rejoinder in Davies and Gather
(2005, p. 1033). For instance, we are mainly interested in the case when $\xi > 0$, which corresponds
to heavy-tailed GPD / GEVD, and an estimate $\hat\xi < 0$ would lead to drastic under-estimation
of the corresponding operational risk.

In the sequel, for the GPD and GEVD cases, we hence consider both situations: with
and without restriction on the parameter space, i.e., that $\xi > 0$ or $\xi \in \mathbb{R}$.

Similar arguments could be carried out in case of shape estimation in the Weibull case,
where $0 < \xi < 1$ corresponds to heavy-tailed, $\xi > 1$ to light-tailed distributions; we do not
pursue this further here.

Breakdown and finite samples As for our purposes, reliability at finite samples is of primary
interest, we will focus on the FSBP.

For deciding upon which procedure to take before having made observations, in
particular for ranking procedures in a simulation study, the FSBP from Definition 3.1 (b) has
some drawbacks: It is deliberately probability-free and based on an actual sample $(x_1, \dots, x_n)$,
which we assume from the ideal situation for the moment. Hence its value depends on the
configuration of this sample. This is desirable when checking safety of a procedure at an
actual data set, but also entails that for the estimators considered in this paper, a generally
valid value for FSBP does not exist, and the only possible universal lower bound will be
the minimal possible value of 0; and even if we made a sample-wise restriction, banning
such samples from the application of the estimator, we would have other ones to come up
with an FSBP of $1/n$ and so forth. This does not reflect the situation to be expected in
the ideal model, though. Hence, we follow the general spirit of robustness to tie robustness
concepts to a central ideal probability model; compare Definition 3.1 (a). To get rid of the
dependence on possibly highly improbable sample configurations leading to an overly small
FSBP, but still preserving the aspect of a finite sample, we propose an expected FSBP:

Definition 3.2 For an estimator $T$ with FSBP $\varepsilon^*_n = \varepsilon^*_n(T; X_1, \dots, X_n)$, we define the expected
FSBP or EFSBP as

$$ \bar\varepsilon^*_n(T) := \mathrm{E}\, \varepsilon^*_n(T; X_1, \dots, X_n) \tag{3.5} $$

where expectation is evaluated in the ideal model.

At some places, if existent, for a sequence $T$ of estimators $T_n$, we also consider the limit

$$ \bar\varepsilon^*(T) := \lim_{n\to\infty} \bar\varepsilon^*_n(T_n) \tag{3.6} $$

which, for brevity, we also call EFSBP where unambiguous.

Admittedly, the evaluation of the expectation in (3.5) in general assumes knowledge of
the parameter, but some vague prior information could be used to restrict the range of the
plausible parameter values, say to $\xi \in (0.5; 2)$, and take the worst behavior of $\bar\varepsilon^*_n(T)$ on this
range to base our decisions on, compare, e.g., Figure 2.

Weighted by their (ideal) occurrence probability, by this definition, improbable sample
configurations of the ideal sample (before contamination) are smoothed out in EFSBP;
we still cannot exclude these configurations, but usually by corresponding Chebyshev-type
inequalities for growing sample size $n$ these will occur with decreasing probability and $\varepsilon^*_n$
will concentrate about $\bar\varepsilon^*_n$. Hence, in practice, without extra knowledge, a priori, the user
can rely on being protected against up to $\bar\varepsilon^*_n(T)\,n$ outliers on average; i.e., although there
may be (rare) cases where we have considerably less protection, these cases are balanced by
corresponding cases with considerably stronger protection.

By averaging, EFSBP is closer again to the ABP of Hampel (1968), but preserves the
finite sample aspect of FSBP. In the examples, we will show that this aspect is non-negligible,
and that for sample sizes about 40, the ABP will still be somewhat misleading (see Table 2
and Figure 3 below), while at the same time, as mentioned, FSBP will be way too
pessimistic. By dominated convergence though, the limit of EFSBP will coincide with the ABP
whenever the FSBP converges to the ABP.

Small values of $\varepsilon^*_n$ for particular samples do not only occur in the models discussed
here: In the one-dimensional normal scale model, we can already have FSBP of 0 for the
median of absolute deviations MAD for large enough values of $n_0$ as introduced before
(3.4). Such events (and similarly extraneous sample configurations), however, occur with
probability 0 in a continuous setting. Otherwise, in situations where a FSBP of 0 could
occur with positive probability in the ideal model, necessarily we have mass points violating
the standard smoothness assumptions usually required in scale models: the corresponding
Fisher information of scale would be infinite then, compare Ruckdeschel and Rieder (2010),
and one may then rather question the use of MAD. In our case, this is somewhat different, as
without arbitrary restrictions on the sample space, samples with FSBP of 0 can occur with
small but positive ideal probability (see $p_0$ in Table 2), although our model remains smooth
(and Fisher information finite).



4 Robust Estimator Types

We illustrate the concept of FSBP in our scale-shape models for Pickands-type and LD-type 
estimators, as defined in the sequel. 



4.1 Pickands Estimator

Pickands estimator (PE) for GPD is a special case of the Elementary Percentile Method
(EPM) as discussed by Castillo and Hadi (1997) for GPD. Such estimators are based on the
empirical quantiles; in our case, we follow Pickands (1975) and use the empirical 50% and
75% quantiles $\hat Q_2$ and $\hat Q_3$. Pickands estimators for $\xi$ and $\beta$ in the GPD model then are defined as

$$ \hat\xi = \frac{1}{\log 2}\, \log\frac{\hat Q_3 - \hat Q_2}{\hat Q_2}, \qquad \hat\beta = \hat\xi\, \frac{\hat Q_2^2}{\hat Q_3 - 2\hat Q_2} \tag{4.1} $$

where we see that for $\hat\beta > 0$ we have to require $\hat Q_3 > 2\hat Q_2$, in which case $\hat\xi > 0$ automatically.
Apparently PE is equivariant in the sense of (2.6).
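The Pickands estimator for the GPD needs only two quantiles, so it is a one-liner in practice. The sketch below (Python, standard library only) computes $\hat\xi = \log((\hat Q_3 - \hat Q_2)/\hat Q_2)/\log 2$ and $\hat\beta = \hat\xi \hat Q_2^2/(\hat Q_3 - 2\hat Q_2)$ from a given median and 75% quantile. The function name and the error handling are assumptions made for illustration; the limit case $\hat Q_3 = 2\hat Q_2$ (i.e., $\hat\xi = 0$) is excluded.

```python
import math

def pickands_gpd(q2, q3):
    """Pickands estimator (beta_hat, xi_hat) for the GPD with location 0.

    q2, q3: empirical 50% and 75% quantiles of the sample.
    """
    if not (0.0 < q2 < q3):
        raise ValueError("need 0 < q2 < q3")
    if q3 == 2.0 * q2:
        raise ValueError("q3 == 2 * q2 corresponds to xi = 0, not covered here")
    xi_hat = math.log((q3 - q2) / q2) / math.log(2.0)
    beta_hat = xi_hat * q2 ** 2 / (q3 - 2.0 * q2)
    return beta_hat, xi_hat

# Sanity check with the exact quantiles of GPD(beta = 1, xi = 0.7):
# the estimator then reproduces the true parameters.
xi, beta = 0.7, 1.0
q2_true = beta * (2.0 ** xi - 1.0) / xi
q3_true = beta * (4.0 ** xi - 1.0) / xi
beta_hat, xi_hat = pickands_gpd(q2_true, q3_true)
```

Multiplying both quantiles by a constant $c > 0$ multiplies `beta_hat` by $c$ and leaves `xi_hat` unchanged, which is the scale-equivariance noted in the text.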
For GEVD, analogous estimates can be obtained by

where $q_0$ is obviously smooth, and, if plotted, easily seen to be strictly isotone, compare
Figure 1; in particular, $\hat\xi > 0$ iff $\hat Q_3 > \hat Q_2 (1 + q_0(0)) = 3.39\,\hat Q_2$, and $\hat\beta > 0$ iff $\hat Q_3 > 2 \hat Q_2$.

Fig. 1 $q_0(\xi)$ for different values of $\xi$; note the logarithmic y-scale



In the Weibull model, Boudt et al. (2011) have shown Pickands (quantile) estimators to
have an explicit representation as

$$ \hat\xi = \frac{f_1(3/4) - f_1(1/2)}{\log(\hat Q_3) - \log(\hat Q_2)}, \qquad \hat\beta = \hat Q_2 / (-\log(1/2))^{1/\hat\xi} \tag{4.4} $$

where $f_1(a) = \log(-\log(1-a))$.

For the Gamma distribution the quantile estimates have no closed-form solutions, so the
matching of empirical and theoretical quantiles is to be done numerically by root solving
procedures.
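The explicit Weibull representation translates directly into code. The following Python sketch evaluates $\hat\xi = (f_1(3/4) - f_1(1/2))/(\log \hat Q_3 - \log \hat Q_2)$ and $\hat\beta = \hat Q_2/(-\log(1/2))^{1/\hat\xi}$ with $f_1(a) = \log(-\log(1-a))$; the function names are illustrative assumptions.

```python
import math

def f1(a):
    """log(-log(1 - a)), the transformed probability level."""
    return math.log(-math.log(1.0 - a))

def pickands_weibull(q2, q3):
    """Quantile estimator (beta_hat, xi_hat) for the Weibull model.

    q2, q3: empirical 50% and 75% quantiles of the sample.
    """
    if not (0.0 < q2 < q3):
        raise ValueError("need 0 < q2 < q3")
    xi_hat = (f1(0.75) - f1(0.5)) / (math.log(q3) - math.log(q2))
    beta_hat = q2 / (-math.log(0.5)) ** (1.0 / xi_hat)
    return beta_hat, xi_hat

# Sanity check with the exact quantiles of a Weibull(beta = 2, xi = 0.7):
beta, xi = 2.0, 0.7
q2_true = beta * (-math.log(0.5)) ** (1.0 / xi)
q3_true = beta * (-math.log(0.25)) ** (1.0 / xi)
beta_hat, xi_hat = pickands_weibull(q2_true, q3_true)
```

Since $\hat Q_3 > \hat Q_2 > 0$ always yields finite, positive estimates, no further solvability restriction arises in this model.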



4.2 MedkMAD and other LD estimators 

Location-Dispersion estimators, introduced by Marazzi and Ruffieux (1999), match
empirical location and dispersion measures of data against their population counterparts to get the
estimates of model parameters, and are applicable for asymmetric location-scale models
(Lognormal), as well as in scale-shape models (GPD, Pareto, Weibull, Gamma).

Let $\theta = (\alpha, \sigma)$ be a parameter vector, $F_n$, $F_{\alpha,\sigma}$ empirical and model distribution
functions, $m(F_n)$, $s(F_n)$, $m(F_{\alpha,\sigma})$, $s(F_{\alpha,\sigma})$ corresponding empirical and model location and
dispersion, then LD estimators $(\hat\alpha, \hat\sigma)$ are solutions of

1) $\hat\sigma\, m(F_{0,1}) + \hat\alpha = m(F_n)$, $\hat\sigma\, s(F_{0,1}) = s(F_n)$
when $\alpha$ is a location parameter,

2) $\hat\sigma\, m(F_{\hat\alpha,1}) = m(F_n)$, $\hat\sigma\, s(F_{\hat\alpha,1}) = s(F_n)$
when $\alpha$ is a shape parameter.

Efficiency and robustness of these estimators depend on the choice of $m(\cdot)$ and $s(\cdot)$, and,
of course, on the respective parametric model. Mean and standard deviation are classical
measures for location and dispersion, respectively. Robust alternatives are median and trimmed
mean for location, and IQR, MAD, trimmed MAD, Sn, Qn for dispersion. In addition, for
asymmetric distributions, we propose a new dispersion measure, namely kMAD. Table 1
displays different variations for LD estimators with increasing efficiency together with
corresponding references.

Definitions of some particular LD estimators Empirical median $\hat m = \hat m_n$ and median of
absolute deviations $\hat M = \hat M_n$ are well known for their high breakdown point, jointly achieving
the highest possible asymptotic breakdown point of 50% among all affine equivariant
estimators at symmetric, continuous univariate distributions.

Hence it is plausible to define an estimator for $\xi$ and $\beta$, matching $\hat m$ and $\hat M$ against their
population counterparts $m$ and $M$ within a scale-shape model. It turns out that the mapping
$(\beta, \xi) \mapsto (m, M)(F_\vartheta)$ is indeed a diffeomorphism, hence for sufficiently large sample size
$n$, we can solve the implicit equations for $\beta$ and $\xi$ to obtain the MedMAD estimator.

More efficient estimators for dispersion than MAD, but with same breakdown point
of 50% at continuous distributions, and in particular suitable for asymmetric distributions,
have been proposed in Rousseeuw and Croux (1993) as $\hat M = Q_n$ and $\hat M = S_n$. In this context,

$$ Q_n = \{ |x_i - x_j|;\ i < j \}_{(k)}, \qquad k = \binom{h}{2} \approx \binom{n}{2}/4, \quad h = \lfloor n/2 \rfloor + 1, \qquad \text{while} \quad S_n = \mathrm{med}_i \{ \mathrm{med}_j\, |x_i - x_j| \} $$

where in case of discrepancies, the inner median is to be taken as hi-med, the outer as lo-med,
where lo-med$(F) = F^{-}(1/2)$, and hi-med$(F) = F^{-}(1/2 + 0)$. The resulting LD estimators
are named MedQn and MedSn, respectively.

Table 1 LD estimators and literature on their use for scale-shape models
(Table footnote: unchecked credit given to Olive (2006) in the cited reference)
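For small samples, $Q_n$ and $S_n$ can be computed by brute force straight from their definitions. The Python sketch below does this in $O(n^2 \log n)$ time and without any consistency constants; it is a naive illustration under these assumptions, not the efficient algorithms available in the literature, and the function names are chosen here for illustration.

```python
from math import comb

def lomed(values):
    """Low median: the floor((n + 1) / 2)-th order statistic."""
    s = sorted(values)
    return s[(len(s) - 1) // 2]

def himed(values):
    """High median: the (floor(n / 2) + 1)-th order statistic."""
    s = sorted(values)
    return s[len(s) // 2]

def s_n(x):
    """S_n = lomed_i himed_j |x_i - x_j| (no consistency constant)."""
    return lomed(himed(abs(xi - xj) for xj in x) for xi in x)

def q_n(x):
    """Q_n = k-th smallest |x_i - x_j|, i < j, with k = C(h, 2), h = n // 2 + 1."""
    n = len(x)
    h = n // 2 + 1
    k = comb(h, 2)
    dists = sorted(abs(x[i] - x[j]) for i in range(n) for j in range(i + 1, n))
    return dists[k - 1]
```

Both functions are scale-equivariant: rescaling all observations by $c > 0$ rescales the result by $c$.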

Note that for asymmetric $G$, the functionals $S(G) = \mathrm{med}_X\, \mathrm{med}_Y |X - Y|$, $X, Y \sim G$,
and $Q(G) = \inf\{ s > 0;\ \int G(t + d^{-1} s)\, dG(t) \ge 5/8 \}$ involve expensive, careful numerical
calculations, in particular for the heavy-tailed GPD and GEVD cases.

In the GEVD and GPD case, due to their considerable skewness to the right, one can
improve the MedMAD estimator considerably, using a dispersion functional that takes this
skewness into account: For a distribution $F$ on $\mathbb{R}$ with median $m$ let us define for $k > 0$

$$ \mathrm{kMAD}(F, k) := \inf\{ t > 0 \mid F(m + kt) - F(m - t) \ge 1/2 \} \tag{4.5} $$

i.e., kMAD only searches among the class of intervals about the median $m$ with covering
probability 50%, where the part right to $m$ is $k$ times longer than the one left to $m$, and returns
the shortest of these. In our case, $k$ would be chosen to be a suitable number larger than 1, and
$k = 1$ would reproduce the MAD. Apparently, whenever $F$ is continuous, kMAD preserves
the ABP of the MAD of 50%, i.e., covering both the explosion and implosion case.
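An empirical version of kMAD can be sketched in a few lines of Python. Distances to the median are divided by $k$ on the right-hand side and left unchanged on the left-hand side; the smallest $t$ such that at least half of the observations have rescaled distance at most $t$ is then returned. The conventions used here (plain sample median from `statistics.median`, closed intervals, order statistic $\lceil n/2 \rceil$) are assumptions of this sketch; the paper's own tie-breaking conventions may differ.

```python
import math
import statistics

def kmad(x, k):
    """Empirical kMAD: smallest t such that [m - t, m + k * t] covers
    at least half of the sample, with m the sample median."""
    if k <= 0:
        raise ValueError("k must be positive")
    m = statistics.median(x)
    # Rescaled distances: right of the median shrunk by 1/k, left unchanged.
    scaled = sorted((xi - m) / k if xi >= m else m - xi for xi in x)
    h = math.ceil(len(x) / 2)
    return scaled[h - 1]
```

With `k = 1` this reduces to an order statistic of $|x_i - m|$, i.e., to the MAD without consistency constant.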



Computation of LD estimators Each of our dispersion estimators Sn, Qn, and kMAD is
scale-equivariant, and the same also holds for the respective population counterparts, as
well as for any fixed quantile, in particular for the median; hence denoting the dispersion
functional by $s$, both the quotient $q(\xi) := s(\beta, \xi)/m(\beta, \xi)$ and its empirical counterpart $\hat q_n$
($q_k$, $\hat q_{k;n}$ for MedkMAD) are scale-free; so we have reduced the problem by one dimension.
In the sequel we also write $q_k$, $\hat q_{k,n}$ for Sn and Qn, where $k$ is then simply void. Assuming
continuity and monotonicity, we obtain an estimator for $\xi$ given by $\hat\xi_n = q_k^{-1}(\hat q_{k,n})$.

A corresponding estimator for $\beta$ for each of the variants kMAD, Sn, and Qn, is then
simply given by

$$ \hat\beta_n = \hat m_n / m(1, \hat\xi_n) \tag{4.6} $$

In particular, by construction all LD estimators are equivariant in the sense of (2.6).
Continuity and monotonicity of $q_k$ as a function in $\xi$ ensure existence and uniqueness of the
implicitly defined estimator for $\xi$.

Continuity of $q_k$ in $\xi$ for all our scale-shape models, i.e., GPD, GEVD, Gamma, and
Weibull, and all our dispersion functionals kMAD($k$), $S$ and $Q$ is straightforward, even for
the limit cases $\xi \to 0$.

Monotonicity of $q_k$, though, is not so obvious from the analytic terms, but the plots of the
function $\xi \mapsto q(\xi)$ for dispersions kMAD, Sn, and Qn, in Figure 4 indicate strict
monotonicity for each of the dispersions and the GPD, Gamma, and Weibull cases, while for the
GEVD case, $q$ is bitone with maximum $\bar q_k$ taken in $\xi_0 > 0$. To obtain consistent estimators
in this case, we restrict ourselves to the range left or right to $\xi_0$ containing $\xi = 0.7$ in this
paper.

Restriction(s) of solvability domain Besides this restriction of the range of $\xi$ in the GEVD
case, we conclude that in the GPD and in GEVD cases, for each of the dispersions, our
restriction to $\xi > 0$ implies a restriction of the solvability domain for $q_k(\xi)$ within the set
of admissible values of $\xi$:

$$ q_k(\xi) > \lim_{\xi \to 0} q_k(\xi) =: \underline{q}_k > 0 \tag{4.7} $$

while in the Weibull and Gamma case, $\underline{q}_k$ can be taken as 0.
The following lemma gives us yet other restrictions:

Lemma 4.1 Let $s$ be the functional version of any of the scale estimators Sn, Qn, and kMAD
(for any $k > 0$). Let $G$ be a distribution on $\mathbb{R}$ such that $-\infty < x_0 = \sup\{x : G(x) = 0\}$, i.e.,
with finite left endpoint. Then with $m = G^{-}(1/2 + 0)$, the hi-med of $G$,

$$ s(G) \le m - x_0 =: s_0 \tag{4.8} $$

with equality iff

(kMAD) $G((m; m + k s_0)) = 0$.

(Sn) $G(x + 2 s_0 - 0) - G(x) < 1/2$ for each $x > x_0$.

(Qn) $G(m) = 1/2$, $G(x_0) = 0$.

Consequently, as $x_0 = 0$, in the GPD, Gamma, and Weibull case,

$$ q_k(\xi) < 1 \quad \forall \xi \tag{4.9} $$

and the same relation in the ideal model also holds sample-wise, i.e.,

$$ \hat q_{k,n} < 1 =: \bar q_k \tag{4.10} $$

in each sample (from the ideal model distribution) where

(kMAD) at least one observation lies in $(\hat m; \hat m + k(\hat m - X_{(1)}))$.

(Sn) there is at least one interval of length shorter than $2(\hat m - X_{(1)})$ containing more
than $\lfloor n/2 \rfloor + 1$ observations.

(Qn) all observations are finite.

Hence, for the LD estimators, we have to find the unique zero $\hat\xi_n$ of $H_k(\xi) = q_k(\xi) - \hat q_{n,k}$
for $\hat q_{n,k}$ in the interval $(\underline{q}_k; \bar q_k)$, which can easily be solved with a standard univariate root-finding tool
like uniroot in R (R Development Core Team (2011)).
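The root-finding step can be illustrated for MedkMAD in the GPD model with a plain bisection instead of `uniroot`. The Python sketch below first computes the population quotient $q_k(\xi)$ (kMAD divided by the median at $\beta = 1$) by an inner bisection and then inverts it in $\xi$. It assumes, as indicated by the plots referred to in the text, that $q_k$ is strictly increasing in $\xi$ for the GPD; the search range for $\xi$, the default $k = 10$, and all function names are assumptions of this sketch.

```python
def gpd_cdf(x, xi):
    """c.d.f. of the GPD with beta = 1, location 0, and shape xi > 0."""
    return 0.0 if x <= 0.0 else 1.0 - (1.0 + xi * x) ** (-1.0 / xi)

def gpd_median(xi):
    """Median of the GPD with beta = 1 and shape xi > 0."""
    return (2.0 ** xi - 1.0) / xi

def bisect(f, lo, hi, iters=100):
    """Plain bisection; assumes f(lo) < 0 <= f(hi) and f increasing."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def q_k(xi, k):
    """Population quotient kMAD / median for GPD(1, xi)."""
    m = gpd_median(xi)
    kmad = bisect(lambda t: gpd_cdf(m + k * t, xi) - gpd_cdf(m - t, xi) - 0.5, 0.0, m)
    return kmad / m

def medkmad_gpd(m_hat, kmad_hat, k=10.0, xi_lo=1e-3, xi_hi=20.0):
    """MedkMAD estimate (beta_hat, xi_hat) from empirical median and kMAD."""
    q_hat = kmad_hat / m_hat
    if not (q_k(xi_lo, k) < q_hat < q_k(xi_hi, k)):
        raise ValueError("q_hat outside the solvability domain of this sketch")
    xi_hat = bisect(lambda xi: q_k(xi, k) - q_hat, xi_lo, xi_hi)
    beta_hat = m_hat / gpd_median(xi_hat)
    return beta_hat, xi_hat
```

Feeding in the exact population median and kMAD of a GPD recovers its parameters, which serves as a basic consistency check of the inversion.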
Producing breakdown Clearly, in the GPD case, we could drive $\hat q_{k,n}$ to values larger than
1 by modifying observations in the original sample to values smaller than $x_0$. These values
would then be identifiable as outliers without error, and we could cancel them from the
sample. Instead we only consider contaminations by values larger than $x_0$ (which could also
have been produced in the ideal model).

At first glance, values of $\hat q_{k,n}$ outside $(\underline{q}_k, \bar q_k)$ would make for a "definition breakdown",
but if, for $s_n$ the respective scale estimator, $s_n \to \hat m$, this entails $\hat\xi_n \to \infty$ in the GPD case and
$\hat\xi_n \to 0$ in the Gamma and Weibull case. Hence we can produce a breakdown in the original
sense by modifying an original sample such that $s_n \to \hat m$.



5 Calculation of (E)FSBP for Pickands and LD Estimators 

In some of our scale-shape models and for some of our estimators we have analytic expres- 
sions for the different breakdown point notions. 



5.1 Pickands Estimator 

Proposition 5.1 (Breakdown for PE) In the GPD, GEVD, Weibull, and Gamma cases, an
upper bound for FSBP of PE is given by 25%, which also invariably is the FSBP in the
Weibull case. In the GPD case, no matter if $\xi \in \mathbb{R}$ or $\xi > 0$, and in the unrestricted GEVD
case, i.e., $\xi \in \mathbb{R}$, FSBP is given by

$$ \varepsilon^*_n = N^0_n / n, \quad \text{for } N^0_n := \#\{ X_i \mid 2\hat Q_2 < X_i < \hat Q_3 \}. \tag{5.1} $$

The ABP then is given by

$$ \bar\varepsilon^* = \varepsilon^* = P_\xi( 2 Q_2 < X_1 < Q_3 ) \tag{5.2} $$

which in the GPD case is just $\varepsilon^* = (2^{\xi+1} - 1)^{-1/\xi} - 1/4$, and, in the GEVD case,
$\varepsilon^* = 3/4 - \exp\big( -(2 \log(2)^{-\xi} - 1)^{-1/\xi} \big)$. In the restricted GEVD case, where $\xi > 0$,

$$ \varepsilon^*_n = \check N^0_n / n, \quad \text{for } \check N^0_n := \#\{ X_i \mid (1 + q_0(0))\, \hat Q_2 < X_i < \hat Q_3 \}. \tag{5.3} $$

The ABP then is given by

$$ \bar\varepsilon^* = \varepsilon^* = P_\xi( (1 + q_0(0))\, Q_2 < X_1 < Q_3 ). \tag{5.4} $$

For $\xi = 0.7$, we obtain $\varepsilon^* = 6.42\%$ in the GPD case, and in the GEVD case, $\varepsilon^* = 15.42\%$
in the unrestricted case, and $\varepsilon^* = 6.13\%$ in the restricted case. For the figures for $\bar\varepsilon^*_n$, for
$n = 40, 100, 1000$ in the GPD, GEVD, and Weibull case, see Table 3 where we make use
of Proposition 5.3 below. In the Gamma case, the situation is more involved, and we skip
computation of the actual breakdown points.
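The closed-form ABP values and the EFSBP of PE can be cross-checked numerically. The Python sketch below evaluates $(2^{\xi+1} - 1)^{-1/\xi} - 1/4$ (GPD) and $3/4 - \exp(-(2\log(2)^{-\xi} - 1)^{-1/\xi})$ (unrestricted GEVD), and approximates the EFSBP in the GPD model by Monte Carlo, averaging the sample-wise count of observations strictly between $2\hat Q_2$ and $\hat Q_3$, divided by $n$. The choice of the order statistics $X_{(\lfloor n/2 \rfloor)}$ and $X_{(\lceil 3n/4 \rceil)}$ as empirical quantiles, the number of replications, and the function names are assumptions of this sketch.

```python
import math
import random

def abp_pe_gpd(xi):
    """ABP of the Pickands estimator in the GPD model."""
    return (2.0 ** (xi + 1.0) - 1.0) ** (-1.0 / xi) - 0.25

def abp_pe_gevd(xi):
    """ABP of the Pickands estimator in the unrestricted GEVD model."""
    return 0.75 - math.exp(-(2.0 * math.log(2.0) ** (-xi) - 1.0) ** (-1.0 / xi))

def fsbp_pe(sample):
    """Sample-wise FSBP of PE: share of observations in (2 * Q2, Q3)."""
    x = sorted(sample)
    n = len(x)
    q2 = x[n // 2 - 1]                    # order statistic floor(n / 2)
    q3 = x[math.ceil(3 * n / 4) - 1]      # order statistic ceil(3n / 4)
    return sum(1 for v in x if 2.0 * q2 < v < q3) / n

def efsbp_pe_gpd(n, xi, reps, rng, beta=1.0):
    """Monte Carlo approximation of the EFSBP of PE at GPD(beta, xi)."""
    total = 0.0
    for _ in range(reps):
        sample = [beta * ((1.0 - rng.random()) ** (-xi) - 1.0) / xi
                  for _ in range(n)]
        total += fsbp_pe(sample)
    return total / reps
```

A call such as `efsbp_pe_gpd(40, 0.7, 2000, random.Random(0))` gives a simulation-based value for a small sample size; for large `n` the result should approach `abp_pe_gpd(0.7)`.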
5.2 LD Estimators

The FSBPs of 50% of the median and the dispersion estimators obviously form an upper
bound for the FSBP of the LD estimators, implying that one could at least drive one of the
parameters $\beta$ and $\xi$ to $\infty$. However, similarly to regression based estimators for the Weibull
case of Boudt et al. (2011), breakdown is not only entailed by moving mass to 0 or $\infty$, and
the actual breakdown points of the LD estimators are smaller; for the MedkMAD, we come
up with some explicit expressions, while for the MedSn and MedQn we have to recur to
simulations, see Subsection 5.5.

Proposition 5.2 (Breakdown for MedkMAD) In the GPD, Weibull, and Gamma cases,
the FSBP of MedkMAD is given by

$$\varepsilon^*_n = \begin{cases} N'_n/n & \text{Weibull; Gamma; GPD, unrestr. case, i.e., } \xi \in \mathbb{R} \\ \min(N'_n, N''_n)/n & \text{GPD, restr. case, i.e., } \xi > 0 \end{cases} \tag{5.5}$$

where

$$N'_n := \#\{X_i \mid \hat m < X_i < (k+1)\hat m\}, \tag{5.6}$$

$$N''_n := \lceil n/2 \rceil - \#\{X_i \mid (1 - \underline{q}_k)\hat m < X_i < (k\underline{q}_k + 1)\hat m\}. \tag{5.7}$$

The ABP in this case is given by $\varepsilon^* = \varepsilon'$ for the unrestricted and $\varepsilon^* = \min(\varepsilon', \varepsilon'')$ for the
restricted case where

$$\varepsilon' = F_\xi((k+1)m) - 1/2, \qquad \varepsilon'' = 1/2 - F_\xi((k\underline{q}_k + 1)m) + F_\xi((1 - \underline{q}_k)m). \tag{5.8}$$

At $k = 10$ and $\xi = 0.7$, we obtain $\varepsilon^* = 44.75\%$ (GPD; $\xi \in \mathbb{R}$), 11.87% (GPD; $\xi > 0$),
49.47% (Gamma), and 47.56% (Weibull). For further figures for $\varepsilon^*_n$, $\bar\varepsilon^*_n$, $\varepsilon^*$, see Table 3, where
again we make use of Proposition 5.3. In particular, contrary to Boudt et al (2011), not only
does our FSBP vary sample-wise in these cases, but ABP and EFSBP also depend on $\xi$.
A plot of the dependency $\xi \mapsto \varepsilon^*(\mathrm{MedkMAD}_{10}; \mathrm{GPD}(\xi))$ is displayed in Figure 2.
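
The unrestricted ABP $\varepsilon' = F_\xi((k+1)m) - 1/2$ is easy to evaluate whenever cdf and median are available in closed form. The sketch below does this for the GPD and the Weibull at $k = 10$, $\xi = 0.7$, assuming unit scale and the parametrizations $F(x) = 1 - (1+\xi x)^{-1/\xi}$ (GPD) and $F(x) = 1 - \exp(-x^\xi)$ (Weibull); the helper names are illustrative and not taken from the paper. The Gamma case would need an incomplete gamma function and is left out here.

```python
import math

def gpd_cdf(x, xi):
    """GPD cdf with scale 1 and shape xi (assumed parametrization)."""
    return 1.0 - (1.0 + xi * x) ** (-1.0 / xi)

def gpd_median(xi):
    return (2.0 ** xi - 1.0) / xi

def weibull_cdf(x, xi):
    """Weibull cdf with scale 1 and shape xi (assumed parametrization)."""
    return 1.0 - math.exp(-(x ** xi))

def weibull_median(xi):
    return math.log(2.0) ** (1.0 / xi)

def abp_medkmad_unrestricted(cdf, median, xi, k):
    """epsilon' = F((k+1) m) - 1/2, cf. (5.8)."""
    return cdf((k + 1) * median(xi), xi) - 0.5

abp_gpd = abp_medkmad_unrestricted(gpd_cdf, gpd_median, xi=0.7, k=10)
abp_weibull = abp_medkmad_unrestricted(weibull_cdf, weibull_median, xi=0.7, k=10)
print(f"GPD    : {100.0 * abp_gpd:.2f}%")
print(f"Weibull: {100.0 * abp_weibull:.2f}%")
```

The results can be compared with the 44.75% (GPD, unrestricted) and 47.56% (Weibull) stated above.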



5.3 Calculation of EFSBP 

To obtain actual values of EFSBP, we have the following proposition. 

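Proposition 5.3 Consider $N^0_n$, $N'_n$, $N''_n$ as defined in (5.1), (5.6), (5.7) and write $\bar F$ for $1 - F$.
Then for $n > 3$,

(a) setting $i_1 = \lfloor n/2 \rfloor$, $i_2 = \lceil 3n/4 \rceil$, and abbreviating $2F^{-1}(u)$ by $t_2$, we obtain for
$l \in \{1, \ldots, i_2 - i_1 - 1\}$

$$P(N^0_n = l) = n \int_0^1 \binom{n-1}{i_1-1,\; i_2-i_1-l,\; n-i_2+l}\, u^{i_1-1}\, (F(t_2) - u)^{i_2-i_1-l}\, \bar F(t_2)^{n-i_2+l}\, du \tag{5.9}$$

and

$$P(N^0_n = 0) = n \sum_{j=i_2-1}^{n-1} \int_0^1 \binom{n-1}{i_1-1,\; j-i_1+1,\; n-j-1}\, u^{i_1-1}\, (F(t_2) - u)^{j-i_1+1}\, \bar F(t_2)^{n-j-1}\, du. \tag{5.10}$$

The case of $\bar N^0_n$ is obtained from (5.9), (5.10) replacing $t_2$ by $t_0 := q_0(0)F^{-1}(u)$.

(b) using the hi-med and setting $t_k := (k+1)F^{-1}(u)$, we obtain for $l \in \{0, \ldots, \lceil n/2 \rceil - 2\}$

$$P(N'_n = l) = n \int_0^1 \binom{n-1}{n/2,\; l,\; n/2-l-1}\, u^{n/2}\, (F(t_k) - u)^{l}\, \bar F(t_k)^{n/2-l-1}\, du \tag{5.11}$$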

[Figure 2: ABP(MedkMAD$_{10}$) in % at GPD(1, $\xi$), plotted against the shape $\xi$; one curve without restriction ($\xi \in \mathbb{R}$), one with restriction ($\xi > 0$).]
Fig. 2 $\varepsilon^*(\mathrm{MedkMAD}_{10}; \mathrm{GPD}(\xi))$ for different $\xi$ with or without restriction $\xi > 0$



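(c) setting $t_+ := (1 + k\underline{q}_k)F^{-1}(u)$, $t_- := (1 - \underline{q}_k)F^{-1}(u)$, we obtain for $l \in \{0, \ldots, n/2 - 1\}$

$$P(N''_n = n/2 - l) = n \sum_{l_2=0}^{l} \int_0^1 \binom{n-1}{n/2-l_2-1,\; l_2,\; l-l_2}\, F(t_-)^{n/2-l_2-1}\, (u - F(t_-))^{l_2}\, (F(t_+) - u)^{l-l_2}\, (1 - F(t_+))^{n/2+l_2-l}\, du. \tag{5.12}$$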



The dependency of EFSBP on $n$ is visualized in Figure 3. We see a saw-tooth like oscillation
which is explained by the use of finite sample quantiles in Proposition 5.3. In particular there
are considerable deviations from ABP for moderate sample sizes.
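
To make the use of Proposition 5.3 concrete, the following Python sketch evaluates the integral in (5.11) numerically for MedkMAD at the GPD. It is a minimal sketch under stated assumptions: even $n$, hi-med as sample median, unit scale with $F(x) = 1 - (1+\xi x)^{-1/\xi}$, and a composite Simpson rule as our own choice of quadrature (the paper does not specify one). All function names are illustrative.

```python
import math

def gpd_sf_at_tk(u, xi, k):
    """Survival function of GPD(scale 1, shape xi) at t_k = (k+1) * F^{-1}(u)."""
    if u >= 1.0:
        return 0.0  # F^{-1}(1) is infinite, so the survival probability vanishes
    quantile = ((1.0 - u) ** (-xi) - 1.0) / xi
    return (1.0 + xi * (k + 1) * quantile) ** (-1.0 / xi)

def pmf_n_prime(l, n, xi, k, steps=2000):
    """P(N'_n = l) following (5.11), for even n, via composite Simpson."""
    half = n // 2
    # multinomial coefficient (n-1)! / ((n/2)! l! (n/2-l-1)!)
    coef = n * math.comb(n - 1, half) * math.comb(n - 1 - half, l)

    def integrand(u):
        sf = gpd_sf_at_tk(u, xi, k)
        inside = max(0.0, 1.0 - sf - u)  # F(t_k) - u, clipped against round-off
        return u ** half * inside ** l * sf ** (half - l - 1)

    h = 1.0 / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(1.0)
    for j in range(1, steps):
        total += (4.0 if j % 2 else 2.0) * integrand(j * h)
    return coef * total * h / 3.0

n, xi, k = 10, 0.7, 10
pmf = [pmf_n_prime(l, n, xi, k) for l in range(n // 2)]  # all attainable counts
efsbp = sum(l * p for l, p in enumerate(pmf)) / n

print("P(N'_n = 0)        =", pmf[0])
print("sum of all weights =", sum(pmf))
print("E[N'_n] / n        =", efsbp)
```

Summing over all attainable counts serves as a sanity check that the weights form a probability distribution; the first and last printed values can be compared with the entries for MedkMAD, $\xi \in \mathbb{R}$, $n = 10$ in Table 2.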



5.4 Illustration: Usefulness of EFSBP 



The expressions given in Propositions 5.1, 5.2, and 5.3 illustrate that in both the Pickands
and LD estimator case, even starting from an ideal sample, the "usual" sample-wise
fluctuations of FSBP $= N_n/n$ are considerable. Moreover, Proposition 5.3 shows that we even have
a positive, although very small ideal probability

$$p_0 := P_\xi(N_n = 0) > 0 \tag{5.13}$$

for breakdown already in the ideal model. Now, on the event $\{N_n = 0\}$, $\varepsilon^*_n = 0$, so no
universal non-trivial lower bound can be given for the FSBP in both the Pickands and LD
estimator case. As the figures in Table 2 below illustrate, however, such an event will hardly
ever occur provided sample sizes are at least moderate, and the same goes for similarly
small realizations of $N_n$, so these cases, as motivated in the introduction of EFSBP, are
indeed not representative. To grasp the difference between $\bar\varepsilon^*_n$ and $\varepsilon^*$, we consider the following
Hoeffding-type lemma for empirical quantiles.

[Figure 3: EFSBP for MedkMAD$_{10}$ and PE in % at GPD(1, 0.7), plotted against the sample size $n$.]
Fig. 3 $\bar\varepsilon^*_n$ for PE and MedkMAD$_{10}$ at GPD(1, 0.7) (restricted to $\xi > 0$) as a function in $n$



Lemma 5.4 (a) Let $0 < \delta < 1/2$ and $t \in \mathbb{R}$ and for given $\alpha \in (0,1)$ and cdf $F$, let $q = F^{-1}(\alpha)$,
and $\hat q_n = \hat F_n^{-1}(\alpha)$. Assume that $F$ is differentiable in $q$ with density $f(q) > 0$.
Then with $t_n = t n^{-1/2+\delta}$, for $n$ large enough,

$$P(|\hat q_n - q| > t_n) \le \exp(-t^2 f(q)^2 n^{2\delta}). \tag{5.14}$$

(b) Let $a_i \ne 0$, $\alpha_i \in (0,1)$, $\alpha_1 \ne \alpha_2$, $i = 1,2$, be given as well as cdf $F$; assume $F$ differentiable
in $a_i q_i$, $i = 1,2$. Then under the assumptions of (a) for $q_i$, for $I_n = (a_1\hat q_{1,n}, a_2\hat q_{2,n})$ and
$I = (a_1 q_1, a_2 q_2)$, we have for $n$ large enough,

$$P^F(I_n) = P^F(I) + O(n^{-1/2+\delta/2}). \tag{5.15}$$



To illustrate the size of the $O(n^{-1/2+\delta/2})$-term, let us also determine the upper $p_1$-quantile
of $\varepsilon^*_n$ for $p_1 = 0.95^{1/10000}$, i.e., the minimal number $q_1$, such that with probability 0.95 we
will not see realizations with $\varepsilon^*_n < q_1$ in 10000 runs of sample size $n$.
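
The level $p_1$ and the bound $q_1$ can be sketched in a few lines of Python. The code below is one possible reading of the definition, not the authors' implementation: it takes a probability mass function of $N_n$ as a plain list and returns the largest value $c/n$ with $P(N_n < c) \le 1 - p_1$. The function names and the toy distribution are hypothetical and serve for illustration only.

```python
import math

def single_run_level(runs=10000, overall=0.95):
    """p_1 such that p_1 ** runs equals the overall level."""
    return overall ** (1.0 / runs)

def q1_from_pmf(pmf, n, runs=10000, overall=0.95):
    """Largest c/n with P(N_n < c) <= 1 - p_1, where pmf[c] = P(N_n = c)."""
    tail = -math.expm1(math.log(overall) / runs)  # 1 - p_1, computed stably
    cumulative = 0.0
    c = 0
    while c < len(pmf) and cumulative + pmf[c] <= tail:
        cumulative += pmf[c]
        c += 1
    return c / n

# Hypothetical toy distribution of N_n on {0, ..., 4} for a sample of size 40
toy_pmf = [1e-7, 2e-6, 1e-5, 0.5, 0.4999879]
print("p_1 =", single_run_level())
print("q_1 =", q1_from_pmf(toy_pmf, n=40))
```

In the toy example, the counts 0 and 1 together carry less mass than $1 - p_1 \approx 5.1 \cdot 10^{-6}$, while adding the count 2 exceeds it, so the sketch returns $2/40$.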



Evaluations for PE and MedkMAD Using the actual distribution of $N_n$ given in
Proposition 5.3, in Table 2 we determine, for Pickands (PE) and MedkMAD with $k = 10$,
the values of $\bar\varepsilon^*_n$, $p_0$ and $q_1$ for $n = 40, 100, 1000$ in the GPD (with and without
restriction to $\xi > 0$), Gamma, and Weibull cases, each with $\xi = 0.7$. The Gamma case is
skipped, though, in the PE case for lack of explicit formulae. Apparently $\bar\varepsilon^*_n$ is quickly
converging in $n$, so $\varepsilon^*$ gives indeed a useful bound on average.

According to the values of $p_0$, breakdown in the ideal model will hardly ever happen for
PE for $n \ge 1000$, and for MedkMAD for $n \ge 100$, and only rarely for $n \ge 40$.

GPD
            estimator           n = 10     n = 40     n = 100    n = 1000    n = ∞
  p_0       PE                  5.1e-01    2.7e-01    7.9e-02    5.4e-08
            MedkMAD, ξ ∈ ℝ      3.3e-04    1.6e-15    7.2e-38    < 1e-300
            MedkMAD, ξ > 0      1.4e-01    3.5e-02    2.7e-03    2.9e-18
  q_1       PE                  0.00%      0.00%      0.00%      1.00%       6.42%
            MedkMAD, ξ ∈ ℝ      0.00%      20.00%     30.00%     41.10%      44.75%
            MedkMAD, ξ > 0      0.00%      0.00%      0.00%      5.70%       11.87%
  EFSBP     PE                  6.44%      5.26%      5.78%      6.34%       6.42%
            MedkMAD, ξ ∈ ℝ      35.85%     42.53%     43.86%     44.66%      44.75%
            MedkMAD, ξ > 0      18.37%     13.45%     12.48%     11.94%      11.87%

GEVD
            estimator           n = 10     n = 40     n = 100    n = 1000    n = ∞
  p_0       PE, ξ ∈ ℝ           2.8e-01    3.8e-02    6.8e-04    8.2e-28
            PE, ξ > 0           5.4e-01    3.7e-01    2.0e-01    5.0e-04
  q_1       PE, ξ ∈ ℝ           0.00%      0.00%      0.00%      9.10%       15.42%
            PE, ξ > 0           0.00%      0.00%      0.00%      0.00%       6.13%
  EFSBP     PE, ξ ∈ ℝ           12.50%     14.38%     14.78%     15.33%      15.42%
            PE, ξ > 0           4.80%      5.54%      6.04%      6.09%       6.13%

Gamma
            estimator           n = 10     n = 40     n = 100    n = 1000    n = ∞
  p_0       MedkMAD             2.3e-04    2.7e-14    4.8e-34    < 1e-300
  q_1       MedkMAD             0.00%      22.50%     38.00%     47.60%      49.47%
  EFSBP     MedkMAD             39.03%     46.80%     48.40%     49.37%      49.47%

Weibull
            estimator           n = 10     n = 40     n = 100    n = 1000    n = ∞
  p_0       PE
            MedkMAD             6.4e-04    5.5e-13    5.6e-31    < 1e-300
  q_1       PE                  25.00%     25.00%     25.00%     25.00%      25.00%
            MedkMAD             0.00%      17.50%     32.00%     44.20%      47.56%
  EFSBP     PE                  25.00%     25.00%     25.00%     25.00%      25.00%
            MedkMAD             37.68%     45.03%     46.54%     47.46%      47.56%

Table 2 $p_0$, $q_1$, and $\bar\varepsilon^*_n$ (rows labelled EFSBP) for PE and MedkMAD ($k = 10$)

The values for $q_1$ demonstrate that in a simulation study at the GPD with $\xi = 0.7$ with
10000 runs of sample size up to $n = 1000$, we will probably see breakdowns for PE, as well
as for the MedkMAD restricted to $\xi > 0$. Contrary to this, as long as we have no more
outliers than 8, 30, 411 for sample sizes $n = 40, 100, 1000$, we will not see a breakdown for
MedkMAD in the unrestricted case; in the Gamma case with the same shape we obtain 9, 38,
476, and in the Weibull case 7, 32, 442; analogous figures for PE at the Weibull with $\xi = 0.7$ are
10, 25, 250.
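
The outlier counts just quoted are simply the $q_1$ values of Table 2 multiplied by the sample size. A small Python sketch of this arithmetic follows; the dictionary layout and the function name are our own, and rounding to the nearest integer is an assumption that only guards against floating-point noise, since the products are whole numbers.

```python
# q_1 values in % as listed in Table 2 (shape xi = 0.7), keyed by sample size n
Q1_PERCENT = {
    "MedkMAD, GPD unrestricted": {40: 20.00, 100: 30.00, 1000: 41.10},
    "MedkMAD, Gamma": {40: 22.50, 100: 38.00, 1000: 47.60},
    "MedkMAD, Weibull": {40: 17.50, 100: 32.00, 1000: 44.20},
    "PE, Weibull": {40: 25.00, 100: 25.00, 1000: 25.00},
}

def outlier_counts(q1_by_n):
    """Translate q_1 (in %) into a number of outliers for each sample size."""
    return {n: round(q1 / 100.0 * n) for n, q1 in q1_by_n.items()}

for setting, q1_by_n in Q1_PERCENT.items():
    print(setting, outlier_counts(q1_by_n))
```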

We may interpret the values of $\bar\varepsilon^*_n$ as follows: Before having made any observations, at
the GPD at $\xi = 0.7$, using PE, one may be confident to be protected against 3 outliers for
sample size 40, 7 for sample size 100, and 65 for sample size 1000, while for MedkMAD,
the corresponding figures are 17, 43, and 447 in the unrestricted case and 5, 12, and 118
when restricted to $\xi > 0$; calculations in the Gamma and Weibull cases give comparable
numbers.



5.5 Breakdown Calculations in the Remaining Cases: Simulational Approach 



For the breakdown point of MedQn and MedSn, as well as for MedkMAD in the GEVD
case, there are no analytical expressions, so we calculate them using simulations.

More precisely, for each of the estimators MedkMAD ($k = 10$), MedQn, MedSn, PE,
and each of the ideal distributional settings GPD, GEVD, Weibull, and Gamma (each at
$\theta = (1, 0.7)$), we produced $M = 10000$ runs of sample sizes $n = 40, 100, 1000$ and noted
the number of alterations needed to move $q_{k;n}$ to $\overline{q}_k$, and in a second round, starting from
the same runs of ideal observations, for GPD and GEVD, the minimal number of
alterations needed to move $q_{k;n}$ to $\underline{q}_k$, respectively the minimum of these two rounds. In the
cases where explicit formulae are available this gives us a possibility to cross-check our
results. Some small discrepancies should arise though, as we use the default median in R,
R Development Core Team (2011), i.e., (hi-med + lo-med)/2 for even sample size, while
Proposition 5.3 is limited to hi-med. For actual simulated values for $\bar\varepsilon^*_n$, see Table 3.
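
For PE at the GPD, Proposition 5.1 identifies the number of alterations needed for breakdown with the count $N^0_n$, so the simulation reduces to averaging $N^0_n/n$ over ideal samples. The following Python sketch illustrates this special case only. It is not the authors' R code: the inversion sampler, the order-statistic conventions for the empirical quartiles, the strict inequalities, the seed, and the reduced number of runs are all assumptions made for illustration.

```python
import math
import random

def rgpd(n, xi, beta, rng):
    """Draw n GPD(scale beta, shape xi) variates by inverting the cdf."""
    return [beta * ((1.0 - rng.random()) ** (-xi) - 1.0) / xi for _ in range(n)]

def fsbp_pe(sample):
    """N_n^0 / n as in (5.1), with order statistics as empirical quartiles."""
    x = sorted(sample)
    n = len(x)
    q2 = x[math.ceil(n / 2) - 1]      # assumed convention for the median
    q3 = x[math.ceil(3 * n / 4) - 1]  # assumed convention for the third quartile
    return sum(1 for v in x if 2.0 * q2 < v < q3) / n

def simulate_efsbp(n, runs, xi=0.7, beta=1.0, seed=1):
    """Monte Carlo mean of the FSBP with a CLT-based 95% half-width."""
    rng = random.Random(seed)
    values = [fsbp_pe(rgpd(n, xi, beta, rng)) for _ in range(runs)]
    mean = sum(values) / runs
    var = sum((v - mean) ** 2 for v in values) / (runs - 1)
    return mean, 1.96 * math.sqrt(var / runs)

mean, half_width = simulate_efsbp(n=1000, runs=300)
print(f"EFSBP ~ {100.0 * mean:.2f}% +/- {100.0 * half_width:.2f}%")
```

With more runs, the estimate can be compared with the PE entries for the GPD in Table 3; small differences are to be expected from the different quantile conventions.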



Conclusion 

This article provides a new measure for global robustness of an estimator at finite samples, 
i.e.; EFSBP, a variant of the finite sample breakdown point which is particularly useful 
in situations where we have only partial equivariance and no non-trivial, universal lower 
bounds for FSBP are available. This variant comes closer to the (sample-free) ABP while 
still retaining the finite sample aspect of FSBP. 

We have illustrated this measure at a set of scale-shape models, applying it to LD and 
Pickands/Quantile-type estimators meant for high-breakdown initial estimators to be en- 
hanced in efficiency by reweighting afterwards. 

Although kMAD, Qn, and Sn all share the same breakdown properties in the location-
scale setting, where they are defined, the corresponding LD estimators in the considered
scale-shape models exhibit a differentiated breakdown behavior, and there is not one single
best estimator.

In the unrestricted GEVD case, the easy-to-compute Pickands-type estimator turned out
to have the highest breakdown point among all considered estimators, while in the setting
restricted to $\xi > 0$, from sample size 100, MedkMAD becomes superior. In all other situations,
the best estimator is either MedkMAD or MedQn. In the unrestricted and restricted GPD
case MedQn performs best, with MedkMAD close in the unrestricted case for $n = 40$. In the
Weibull and Gamma cases MedkMAD performs best, except for the Weibull at $n = 1000$
where MedQn is best, but with MedkMAD close by. For deciding between MedkMAD and
MedQn in cases where their breakdown points are similar, though, one should also take
computational costs into account, which so far clearly favor MedkMAD.



A Proofs 

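Proof to Lemma 4.1 For any $k > 0$, $G(m + k s_0) - G(m - s_0 - 0) = G(m + k s_0) \ge 1/2$, so $s_0 \ge \mathrm{kMAD}(G,k)$.
For $x \ge x_0$ and $Y \sim G$, let $g_G(x) = \mathrm{med}(|Y - x|) = \inf\{s > 0\colon G(s + x) - G(x - s - 0) \ge 1/2\}$. But
$G(s_0 + x) - G(x - s_0 - 0) = G(s_0 + x)$ for $x \le m$, so $g_G(x) \le s_0$ for $x \le m$, and hence, as $\{x \le m\} \subset \{g_G(x) \le s_0\}$,
$S(G) = \inf\{t > 0\colon P(g_G(X) \le t) \ge 1/2\} \le s_0$. Finally, for $X, Y \sim G$, stoch. indep.,
$Q(G) = \inf\{s\colon P(|X - Y| \le s) \ge 1/4\} \le s_0$, as

$$P(|X - Y| \le s_0) = \int G(x + s_0) - G(x - s_0 - 0)\, G(dx) \ge \int_{[x_0,m]} G(x + s_0)\, G(dx) \ge \int_{[x_0,m]} \tfrac{1}{2}\, G(dx) \ge \tfrac{1}{4}. \tag{A.1}$$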

Assume $s(G) = s_0$. In case of kMAD this happens iff $G(m + k s_0) = 1/2$, or, equivalently, $G((m; m + k s_0)) = 0$.
In case of Sn, $S(G) = s_0$ iff $P(g_G(X) \ge s) \ge 1/2$ for all $s < s_0$, or, equivalently, $P(x\colon G((x - s_0; x + s_0)) \le 1/2) \ge 1/2$.
But $x - s_0 \le x_0$ whenever $x \le m$, so $G((x - s_0; x + s_0)) = G(x + s_0) \ge G(m) = 1/2$. Hence $S(G) = s_0$
iff $G((x - s_0; x + s_0)) \le 1/2$ for all $x > m$, or, equivalently, iff $G(x + 2s_0 - 0) - G(x) \le 1/2$ for $x > x_0$.
In case of Qn, $Q(G) = s_0$ iff the inequalities in (A.1) are equalities, i.e., iff $G([x_0; m]) = 1/2 = G(m + s_0)$,
and $\int_{(m,\infty)} G(x + s_0) - G(x - s_0 - 0)\, G(dx) = 0$. The last integral is 0 iff $G((m;\infty)) = 0$, so that altogether,
$Q(G) = s_0$ iff $G(m) = G(\{\infty\}) = 1/2$. □



Proof to Proposition 5.1 For all models, i.e., GPD, GEVD, Weibull, and Gamma, we can render the scale
estimator arbitrarily large for $\hat Q_3$ sufficiently large, so $\varepsilon^*_n \le 1/4$. In case of GPD and GEVD, $\hat\beta < 0$ once
$\hat Q_3 < 2\hat Q_2$, which certainly happens if, in an ideally distributed sample, we replace all observations $X_i$,
$2\hat Q_2 < X_i < \hat Q_3$, by $\hat Q_2$, entailing (5.1). Appealing to Lemma 5.4, up to an event of probability
$O(\exp(-c n^{2\delta}))$ for some $c > 0$,

$$\bar\varepsilon^*_n = \varepsilon^* + O(n^{-1/2+\delta/2}). \tag{A.2}$$

As (4.4) gives valid values for $\xi$ and $\beta$ for any values of $Q_3$ and $Q_2$, in the Weibull case, we cannot lower
the upper bound of 25%, i.e., $\lim_n \bar\varepsilon^*_n = \bar\varepsilon^* = \varepsilon^* = 1/4$. □

Proof to Proposition 5.2 As we have seen in the considerations in Section 4.2 on producing breakdown, we
can only solve (uniquely) for $\xi$ and $\beta$ as long as the quotient $q_{k;n}$ falls into $(\underline{q}_k, \overline{q}_k)$; case-by-case
considerations indeed show that driving $q_{k;n}$ to either $\underline{q}_k$ (in case of GPD and GEVD) or $\overline{q}_k$ (in all cases) produces
breakdown, that is, breakdown could be achieved by either moving all $N'_n$ observations from (5.6) for which
$\hat m < X_i < \hat m + k\hat m$ to $(k+1)\hat m$ (entailing $q_{k;n} = 1$) or by moving a number of $N''_n$ observations (as defined
in (5.7)) to the interval $[(1 - \underline{q}_k)\hat m, (k\underline{q}_k + 1)\hat m]$ up to the point that it contains $n/2$ observations (entailing
$q_{k;n} \le \underline{q}_k$). The actual FSBP is then given by the alternative that requires moving fewer observations. The terms for
ABP follow with the usual LLN argument. □

Proof to Proposition 5.3 We start with the fact that for $X_i \sim F$ i.i.d. with Lebesgue density $f$, the joint c.d.f. of
the order statistics $X_{[i_1:n]}$, $X_{[i_2:n]}$ for $1 \le i_1 < i_2 \le n$ for $s < t$ can be written as

$$G(s,t) = n \int_{-\infty}^{s} f(x) \binom{n-1}{i_1-1} F(x)^{i_1-1} \sum_{j_2 = i_2 - i_1}^{n - i_1} \binom{n-i_1}{j_2} (F(t) - F(x))^{j_2}\, \bar F(t)^{n-i_1-j_2}\, dx.$$

Hence

$$P(N'_n \ge l) = P\bigl(X_{[(n/2+l+1):n]} \le (k+1) X_{[(n/2+1):n]}\bigr)$$

and (5.11) follows by taking differences. Cases (5.9) and (5.12) follow similarly. □

Proof to Lemma 5.4 We note that $\{\hat q_n \le t\} = \{\sum_i \mathrm{I}(X_i \le t) \ge n\alpha\}$. Hence with Hoeffding's inequality,
Hoeffding (1963), $P(|\hat q_n - q| > t_n) \le 2\exp(-2n(F(t_n + q) - \alpha)^2)$ and (a) follows from
$F(t_n + q) - \alpha = f(q)t_n + o(t_n)$. For (b), note that $P(I_n \Delta I) \le \mathrm{E}|F(\hat q_{1,n}) - \alpha_1| + \mathrm{E}|F(\hat q_{2,n}) - \alpha_2|$. Hence, for large enough
$n$, $P(I_n \Delta I) \le 2f(a_1 q_1)|a_1|\,\mathrm{E}|\hat q_{1,n} - q_1| + 2f(a_2 q_2)|a_2|\,\mathrm{E}|\hat q_{2,n} - q_2|$, and, applying that for a random
variable $Z$ taking values in $[0,1]$, for $t \in (0,1)$, $0 \le \mathrm{E}Z \le t + \int_t^1 P(Z > s)\,ds$, so by Mill's ratio,
$P(I_n \Delta I) \le 2t + \sum_i \exp(-2nt^2 f(q_i)^2)/(2nt f(q_i)^2)$. Plugging in $t = n^{-1/2+\delta}$, we obtain (b). □

Table 3 Simulated EFSBP in % with CLT-based 95%-confidence interval (CI) for $\theta = (\xi = 0.7, \beta = 1)$;
number of runs is 10000

  n       Model          MedSn    ±CI     MedQn    ±CI     MedkMAD10   ±CI     PE       ±CI
  40      GPD ξ ∈ ℝ      34.69    0.33    43.74    0.09    44.68       0.13    5.94     0.10
          GPD ξ > 0       8.78    0.18    23.44    0.21    10.65       0.07    5.94     0.10
          GEVD ξ ∈ ℝ      6.99    0.21     5.89    0.21    13.38       0.24    14.85    0.13
          GEVD ξ > 0      6.99    0.21     5.89    0.21     4.75       0.13    7.87     0.16
          Weibull        37.63    0.34    40.32    0.11    47.31       0.02    25.00*   0.00*
          Gamma          34.55    0.32    41.97    0.10    49.17       0.02    n.a.     -
  100     GPD ξ ∈ ℝ      23.55    0.21    47.51    0.04    44.73       0.09    6.12     0.07
          GPD ξ > 0      12.44    0.16    18.42    0.16    11.32       0.05    6.12     0.07
          GEVD ξ ∈ ℝ      3.25    0.09     2.88    0.09     8.86       0.14    15.01    0.09
          GEVD ξ > 0      3.25    0.09     2.88    0.09     6.32       0.11    6.71     0.05
          Weibull        26.58    0.30    45.12    0.05    47.41       0.02    25.00*   0.00*
          Gamma          25.42    0.21    45.90    0.04    49.35       0.02    n.a.     -
  1000    GPD ξ ∈ ℝ      21.86    0.03    49.75    0.00    44.75       0.03    6.38     0.03
          GPD ξ > 0      14.99    0.13    16.06    0.02    11.82       0.02    6.37     0.03
          GEVD ξ ∈ ℝ      1.06    0.03     1.27    0.03     7.25       0.05    15.39    0.04
          GEVD ξ > 0      1.06    0.03     1.27    0.03     7.22       0.05    6.20     0.08
          Weibull        19.77    0.03    49.01    0.01    47.55       0.01    25.00*   0.00*
          Gamma          24.13    0.04    49.16    0.01    49.46       0.01    n.a.     -

* : theoretical values,
n.a.: not available; in these cases, 25% is an upper bound

[Figure 4: four panels showing the quotients $q(\xi) = \mathrm{kMAD}_1(\xi,1)/\mathrm{median}(\xi,1)$, $q(\xi) = \mathrm{kMAD}_{10}(\xi,1)/\mathrm{median}(\xi,1)$, $q(\xi) = \mathrm{Qn}(\xi,1)/\mathrm{median}(\xi,1)$, and $q(\xi) = \mathrm{Sn}(\xi,1)/\mathrm{median}(\xi,1)$.]
Fig. 4 Quotients kMAD($\xi$, k = 1)/med($\xi$) and kMAD($\xi$, k = 10)/med($\xi$), Qn($\xi$)/med($\xi$) and
Sn($\xi$)/med($\xi$) as functions in $\xi$; we also include the respective $\underline{q}$, $\overline{q}$



